The choice and number of output nodes is also tied to the activation function, which in turn depends on the application at hand. For example, if k-way classification is intended, k output values can be used, with a softmax activation function applied to the outputs ν̄ = [ν1, ..., νk] at the nodes in a given layer. Specifically, the activation function for the i-th output is defined as follows:

\[
\Phi(\bar{\nu})_i = \frac{\exp(\nu_i)}{\sum_{j=1}^{k} \exp(\nu_j)} \qquad \forall i \in \{1, \ldots, k\}
\]
It is helpful to think of these k values as the values output by k nodes, whose inputs are ν1, ..., νk. An example of the softmax function with three outputs is illustrated in Figure 1.9, where the values ν1, ν2, and ν3 are also shown. Note that the three outputs correspond to the probabilities of the three classes: the softmax function converts the three real-valued outputs of the final hidden layer into probabilities. The final hidden layer often uses linear (identity) activations when its output is fed into the softmax layer. Furthermore, there are no weights associated with the softmax layer, since it only converts real-valued outputs into probabilities.
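As a concrete illustration, the following snippet (a minimal sketch, not code from the original post; the use of NumPy and the example scores are assumptions) computes the softmax of three real-valued outputs, subtracting the maximum before exponentiating for numerical stability:

```python
import numpy as np

def softmax(v):
    # Subtracting the maximum before exponentiating avoids overflow
    # and leaves the resulting probabilities unchanged.
    e = np.exp(v - np.max(v))
    return e / e.sum()

# Three real-valued outputs of the final hidden layer (cf. Figure 1.9).
v = np.array([2.0, 1.0, 0.1])
p = softmax(v)
print(p)         # roughly [0.659, 0.242, 0.099]
print(p.sum())   # the outputs sum to 1, i.e. a probability distribution
```

Note that because the softmax layer itself has no weights, the snippet contains nothing to learn; it only maps the real-valued scores to class probabilities.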



The use of softmax with a single hidden layer of linear activations exactly implements a model referred to as multinomial logistic regression. Similarly, many variations such as multiclass SVMs can be easily implemented with neural networks. Another example of a case in which multiple output nodes are used is the autoencoder, in which each input data point is fully reconstructed by the output layer. The autoencoder can be used to implement matrix factorization methods like singular value decomposition. This architecture will be discussed in detail in later posts. The simplest neural networks that simulate basic machine learning algorithms are instructive because they lie on the continuum between traditional machine learning and deep networks. By exploring these architectures, one gets a better idea of the relationship between traditional machine learning and neural networks, and also of the advantages provided by the latter.
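To make the connection concrete, here is a minimal sketch (using NumPy, synthetic data, and plain gradient descent, all of which are assumptions on my part rather than the post's own implementation) of a network consisting of a single layer of linear activations feeding a softmax output, trained on the cross-entropy loss. This is exactly multinomial logistic regression:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(scores):
    # Row-wise softmax with max-subtraction for numerical stability.
    shifted = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

# Synthetic data: n points in d dimensions with labels in {0, ..., k-1}.
n, d, k = 100, 5, 3
X = rng.normal(size=(n, d))
y = rng.integers(0, k, size=n)
Y = np.eye(k)[y]                      # one-hot targets, shape (n, k)

W = np.zeros((d, k))                  # weights of the single linear layer
lr = 0.1
for _ in range(500):
    P = softmax(X @ W)                # linear scores -> class probabilities
    grad = X.T @ (P - Y) / n          # gradient of mean cross-entropy w.r.t. W
    W -= lr * grad                    # plain gradient-descent update

predictions = softmax(X @ W).argmax(axis=1)
```

The only trainable parameters are the weights of the linear layer; the softmax stage simply converts its scores into probabilities, which is why the network coincides with the classical multinomial logistic regression model.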